Abstract

A data parallel machine represents an array or other composite data 
structure by allocating one processor (at least conceptually) per data 
item. A pointwise operation can be performed between two such arrays 
in unit time, provided their corresponding elements are allocated in the 
same processors. 

If the arrays are not aligned in this fashion, the cost of moving one 
or both of them is part of the cost of the operation. The choice of 
where to perform the operation then affects this cost. If an expression 
with several operands is to be evaluated, there may be many choices 
of where to perform the intermediate operations. 

We give an efficient algorithm to find the minimum-cost way to 
evaluate an expression, for several different data parallel architectures. 

Our algorithm applies to any architecture in which the metric describ- 
ing the cost of moving an array has a property we call “robustness.” 

This encompasses most of the common data parallel communication 
architectures, including meshes of arbitrary dimension and hypercubes. 

We remark on several variations of the problem, some of which we solve 
and some of which remain open. 

Keywords: data parallel architecture, compilers for parallel computers,
Steiner trees.

1 Introduction


1.1 The problem 

Massively parallel architectures are an emerging fact in scientific computing. 
They offer very high peak computation rates but in general they must be 
carefully programmed to avoid performance bottlenecks that can be caused 
by poor load balancing or by excessive interprocessor or memory traffic. Sev- 
eral such architectures provide especially fast communication paths between 
neighboring processors on a grid of one, two, or more dimensions. Commercially
available examples include such SIMD machines as the MPP, AMT
DAP, Connection Machine, and MasPar MP-1, and such MIMD message-passing
machines as those from Intel and Ncube.

We shall be concerned in this paper with the following problem: Given
n arrays of data $A_1, \ldots, A_n$, all of the same shape, we wish to evaluate an
arithmetic expression involving these arrays, arbitrary binary operations (to
be applied elementwise), and parentheses. We assume that the arrays are
situated at various places in a multicomputer consisting of processors with 
local memory, connected in some regular fashion. A given metric describes 
the cost of moving an array from one position to another within the ma- 
chine. The problem is to determine the positions at which to carry out the 
individual operations in the expression. 

Figure 1 is an example for a two-dimensional grid of processors. Four
arrays w, x, y, and z occupy different parts of a two-dimensional grid of
processors. Define the “position” of an array to be the position of its upper
left, or (1, 1), element. We want to evaluate the expression

$(w \oplus x) \otimes (y \odot z)$,

where $\oplus$, $\otimes$, and $\odot$ are operations that act elementwise on the array. Suppose
that moving an entire array one position north, south, east, or west has
unit cost. Then the metric is the two-dimensional $l_1$ “Manhattan metric.”
We could for example move all the arrays to the position of array w at a cost
of 185 (30 for x, 90 for y, and 65 for z); or we could move w to x’s position
and perform $\oplus$ (cost 30), move y and z to position (30,60) and perform $\odot$
(cost 85), and move that result to x’s position to perform $\otimes$ (cost 30), for a
total cost of 145.
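
The arithmetic of this example can be checked mechanically. The following is a minimal Python sketch, not part of the original presentation; the function names are illustrative, and the coordinates are those given in the caption of Figure 1.

```python
# Positions of the upper-left elements, taken from the caption of Figure 1.
POS = {"w": (10, 80), "x": (30, 90), "y": (80, 60), "z": (25, 30)}

def manhattan(p, q):
    """l1 distance: number of unit north/south/east/west moves from p to q."""
    return sum(abs(a - b) for a, b in zip(p, q))

def cost_all_to(target):
    """Cost of moving every array to one position and operating there."""
    return sum(manhattan(p, target) for p in POS.values())

def cost_staged(left_site, right_site, root_site):
    """Cost when w and x meet at left_site, y and z meet at right_site,
    and the two intermediate results meet at root_site."""
    left = manhattan(POS["w"], left_site) + manhattan(POS["x"], left_site)
    right = manhattan(POS["y"], right_site) + manhattan(POS["z"], right_site)
    top = manhattan(left_site, root_site) + manhattan(right_site, root_site)
    return left + right + top

print(cost_all_to(POS["w"]))                      # everything moved to w
print(cost_staged(POS["x"], (30, 60), POS["x"]))  # the staged plan in the text
```

The two printed totals correspond to the two strategies described above.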

Figure 1: Four arrays are to be combined in the expression $(w \oplus x) \otimes (y \odot z)$.
The upper left-hand element of w is at (10,80), of x at (30,90), of y
at (80,60), and of z at (25,30).

In this paper we give an efficient algorithm to find the minimum-cost
evaluation procedure for an arbitrary expression, provided the metric satisfies
a condition that we call robustness. Among the cases in which our
results apply are the $l_1$ metric in any number of dimensions, the $l_\infty$ metric
in two dimensions, the discrete metric, and the Hamming distance metric
describing shortest distance in a hypercube. We expect that the chief use
of these results will be in compiler and run time optimization for massively
parallel machines.

This tree embedding problem is one of a large class of so-called Steiner
problems, which have applications in many areas [8]. The work most closely
related to ours is in the context of describing an evolutionary tree in biology.
Several specific metrics have been studied, including $l_1$ [1], the discrete metric
[2, 4], and a string mutation distance metric [7]. Our results on robust
metrics include those of [1, 2, 4] as special cases.

Several variations of the problem are of interest, including the following. 

• A position may be specified in which the final result is to be placed. 
As described in Section 5, this variant reduces easily to the case where 
the final result position is free. 

• Various metrics are of interest for different architectures. The most
realistic metrics include the one-dimensional Euclidean metric, higher-dimensional
$l_1$ and $l_\infty$ metrics, the hypercube metric, and some combinations
of these metrics with discrete metrics.

• We could distinguish between arrays stored by rows and arrays stored 
by columns, and include the cost of any necessary transpositions. More 
generally we could include the possibility of translating among several 
alternative representations of a data structure. This does not change 
the problem; it just makes the metric a bit more complicated. 

• We could allow associative and/or commutative rearrangement of the 
expression tree. As described below, this makes the problem easier in 
one dimension and harder in higher dimensions. 

• We could take common subexpressions into account, possibly even 
allowing copies of the arrays to be left in strategic positions during a 
move. This is probably very hard. 

1.2 Definitions 

We are given a universe P of possible positions (corresponding to processors), 
and a function d that describes the cost of moving data from one position 
to another. The function d is a metric:

• $d(p,q) = d(q,p) \ge 0$.

• $d(p,q) = 0$ if and only if $p = q$.

• $d(p,q) + d(q,r) \ge d(p,r)$.

If X and Y are sets of positions then $d(X,Y) = \inf\{d(x,y) : x \in X, y \in Y\}$.
(If P is finite we can replace inf by min.) The triangle inequality implies
that $d(X,p) + d(p,Y) \ge d(X,Y)$ if p is a position and X and Y are sets of
positions.

Now we make several definitions that allow us to identify a class of 
metrics for which we can efficiently find minimum-cost tree embeddings. 

The generalized intersection of two sets X and Y of positions is defined
as

$X \sqcap Y = \{p \in P \mid d(p,X) + d(p,Y) = d(X,Y)\}.$

If $X \cap Y$ is nonempty and closed then $X \sqcap Y = X \cap Y$. If X and Y are disjoint
intervals on the real line and d is Euclidean distance then $X \sqcap Y$ is the closure
of the set of points between X and Y. Figure 2 shows three examples of
generalized intersections in the $l_1$ metric in two dimensions, using as data
the positions of the upper left corners of the arrays in Figure 1.
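
For a finite universe the definition of the generalized intersection can be evaluated by brute force. The sketch below is illustrative only: the 101 by 101 grid, the function names, and the `bounding_box` helper are assumptions introduced here, and the approach is far slower than the rectangle arithmetic described in Section 4.

```python
from itertools import product

def l1(p, q):
    """Manhattan distance between two grid positions."""
    return sum(abs(a - b) for a, b in zip(p, q))

def set_dist(d, X, Y):
    """d(X, Y) for finite sets, where inf becomes min."""
    return min(d(x, y) for x in X for y in Y)

def gen_intersection(d, universe, X, Y):
    """All p in the universe with d(p, X) + d(p, Y) == d(X, Y)."""
    dxy = set_dist(d, X, Y)
    return {p for p in universe
            if set_dist(d, [p], X) + set_dist(d, [p], Y) == dxy}

def bounding_box(points):
    """Smallest axis-parallel rectangle containing the points."""
    xs, ys = zip(*points)
    return (min(xs), min(ys)), (max(xs), max(ys))

# Hypothetical finite universe: a 101 x 101 grid of processor positions.
GRID = list(product(range(101), repeat=2))

A = gen_intersection(l1, GRID, {(10, 80)}, {(30, 90)})  # positions of w and x
B = gen_intersection(l1, GRID, {(80, 60)}, {(25, 30)})  # positions of y and z
print(bounding_box(A), len(A))
print(bounding_box(B), len(B))
```

In both cases the result fills its bounding box completely, which is consistent with the rectangular regions described for Figure 2.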

For an arbitrary metric d, we define sets of positions called generalized
intervals in terms of $\sqcap$ as follows: A generalized interval is either a set
containing a single position, or the generalized intersection of two generalized 
intervals. Thus, for example, the generalized intervals on the Euclidean real 
line are the nonempty closed bounded intervals. We will need generalized 
intervals to be nonempty and compact. This is true whenever P is finite 
(which is the only realistic case), or indeed whenever P is a finite-dimensional 
complete normed vector space. 

A metric d is called robust if all generalized intervals are nonempty and
compact and

$d(p,I) + d(p,J) \ge d(p, I \sqcap J) + d(I,J)$

holds for all positions p and all generalized intervals I and J. For example,
the Euclidean metric on the real line is robust. We will see that the Euclidean
metric $l_2$ in two dimensions is not robust, although the $l_1$ and $l_\infty$ metrics in
two dimensions are robust.

The following lemma is easy to verify. 

Lemma 1 The sum of two robust metrics is a robust metric. □

Figure 2: Generalized intersections in the $l_1$ metric on a two-dimensional
grid. Region A is $\{w\} \sqcap \{x\}$, region B is $\{y\} \sqcap \{z\}$, and region C is $A \sqcap B$.

We conclude with some definitions concerning embeddings of a tree into 
a metric space. 

Let T be a rooted binary tree in which each node is either a leaf with no 
children or an internal node with two children. We will use lower-case italic 
x, y, . . . for nodes. 

If x is a node in a tree T, then T(x) is the subtree of T rooted at x. An
embedding of T is a choice of a position $\pi(x)$ for every node x of T. The cost
of an embedding $\pi$ of T is $\mathrm{cost}(\pi, T)$, which we will write as cost(T) when
the embedding is implicit. It is defined as

$\mathrm{cost}(\pi, T) = \sum_{\text{edges } \{x,y\} \text{ of } T} d(\pi(x), \pi(y)).$

In our problems the positions of the leaves of T are fixed. The cost 
of a minimum-cost embedding of T subject to those fixed leaf positions is 
mincost(T). The leaf positions are always fixed in the sequel, and we will 
not mention the fact explicitly again.

If x is a node of T, we define Opt(x) as the set of possible values of $\pi(x)$
in minimum-cost embeddings of T(x). That is,

$\mathrm{Opt}(x) = \{p \in P \mid \exists \text{ a min-cost embedding } \pi \text{ of } T(x) \text{ with } \pi(x) = p\}.$

Notice that Opt(x) minimizes only the cost of the subtree rooted at x, not
all of T.

2 Robust metrics 

In this section we present an efficient algorithm for constructing minimum- 
cost embeddings for robust metrics. Throughout the section, we consider 
the variant in which the final result position is free, and commutative and 
associative rearrangement of the expression are not allowed. 

The main theorem about robust metrics says that every subtree has a 
locally optimal embedding that can be extended to an embedding of the 
entire tree. 

Theorem 2 Let d be a robust metric on a universe P of positions. Let T be
a binary expression tree with positions specified for its leaves. The following
are true for any internal node x of T with children y and z.

1. $\mathrm{Opt}(x) = \mathrm{Opt}(y) \sqcap \mathrm{Opt}(z)$.

2. Every embedding $\pi$ of T(x) satisfies

$\mathrm{cost}(\pi, T(x)) \ge \mathrm{mincost}(T(x)) + d(\pi(x), \mathrm{Opt}(x)).$

3. For all $p \in \mathrm{Opt}(x)$ there is a minimum-cost embedding $\pi$ of T(x) with
$\pi(x) = p$, $\pi(y) \in \mathrm{Opt}(y)$, and $\pi(z) \in \mathrm{Opt}(z)$.

Proof. We induct on the size of T(x). If the root is a leaf there is
nothing to prove. For the inductive step, let x be an internal node with
children y and z. Let $G = \mathrm{Opt}(y) \sqcap \mathrm{Opt}(z)$, let $c_y = \mathrm{mincost}(T(y))$, and let
$c_z = \mathrm{mincost}(T(z))$. Note that G is nonempty because d is robust and, by
the inductive hypothesis, Opt(y) and Opt(z) are generalized intervals.

For any embedding $\pi$ of T(x), we have

$\mathrm{cost}(T(x)) = d(\pi(x),\pi(y)) + \mathrm{cost}(T(y)) + d(\pi(x),\pi(z)) + \mathrm{cost}(T(z)).$

Applying part (2) of the inductive hypothesis to T(y) and T(z), and then
using the triangle inequality (in the form
$d(\pi(x),\pi(y)) + d(\pi(y),\mathrm{Opt}(y)) \ge d(\pi(x),\mathrm{Opt}(y))$, and likewise for z), we get

$\mathrm{cost}(T(x)) \ge d(\pi(x),\mathrm{Opt}(y)) + c_y + d(\pi(x),\mathrm{Opt}(z)) + c_z.$

Because d is robust we have
$d(\pi(x),\mathrm{Opt}(y)) + d(\pi(x),\mathrm{Opt}(z)) \ge d(\pi(x),G) + d(\mathrm{Opt}(y),\mathrm{Opt}(z))$.
Therefore

$\mathrm{cost}(T(x)) \ge c_y + c_z + d(\mathrm{Opt}(y),\mathrm{Opt}(z)) + d(\pi(x),G).$ (1)

Now consider any $p_x \in G$. Since Opt(y) and Opt(z) are compact, there
are positions $p_y \in \mathrm{Opt}(y)$ and $p_z \in \mathrm{Opt}(z)$ with $d(p_x,p_y) = d(p_x,\mathrm{Opt}(y))$
and $d(p_x,p_z) = d(p_x,\mathrm{Opt}(z))$. By the induction hypothesis there are embeddings
of T(y) and T(z) of cost $c_y$ and $c_z$ with $\pi(y) = p_y$ and $\pi(z) = p_z$.

Because $G = \mathrm{Opt}(y) \sqcap \mathrm{Opt}(z)$, we have
$d(p_x,\mathrm{Opt}(y)) + d(p_x,\mathrm{Opt}(z)) = d(\mathrm{Opt}(y),\mathrm{Opt}(z))$.
If $\pi(x) = p_x$, then, we have

$\mathrm{cost}(T(x)) = c_y + c_z + d(\mathrm{Opt}(y),\mathrm{Opt}(z)).$

Inequality (1) says that we can do no better, and that if $\pi(x) \notin G$ then
$\mathrm{cost}(T(x)) > c_y + c_z + d(\mathrm{Opt}(y),\mathrm{Opt}(z))$. Therefore
$\mathrm{mincost}(T(x)) = c_y + c_z + d(\mathrm{Opt}(y),\mathrm{Opt}(z))$ and $G = \mathrm{Opt}(x)$,
proving conclusion (1). Then inequality (1) is exactly conclusion (2).
Conclusion (3) follows from the construction, finishing the proof. □

We can find a minimum-cost embedding in linear time by using conclu- 
sions (1) and (3) of this theorem, provided we can efficiently compute the 
generalized intersection of two generalized intervals and find the point of a 
generalized interval closest to a given position. 

Here is the algorithm for the general case (assuming a robust metric).
The input is a tree T with root r, and specified values $\pi(x)$ for every leaf x.
The output is an optimal choice of $\pi(x)$ for every internal node of T. Section 4
describes the algorithm in more detail for some specific metrics.

Step 1. Set $\mathrm{Opt}(x) = \{\pi(x)\}$ for each leaf x.

Step 2. Traverse the tree in postorder, computing
$\mathrm{Opt}(x) = \mathrm{Opt}(y) \sqcap \mathrm{Opt}(z)$ for each internal node x with children y and z.

Step 3. Set $\pi(r)$ to be any element of Opt(r).

Step 4. Traverse the tree in preorder, computing $\pi$ as follows: Place the
child y of x at the point of Opt(y) closest to $\pi(x)$.
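
Steps 1 to 4 can be sketched for any finite universe by computing generalized intersections directly from their definition. The Python below is a brute-force illustration under stated assumptions, not the efficient procedure of Section 4: the dictionary-based tree representation, the function names, and the small one-dimensional example are all introduced here for illustration, and the result is only guaranteed optimal when the metric is robust.

```python
def set_dist(d, X, Y):
    """d(X, Y) for finite sets."""
    return min(d(x, y) for x in X for y in Y)

def gen_intersection(d, universe, X, Y):
    """Generalized intersection, straight from the definition."""
    dxy = set_dist(d, X, Y)
    return frozenset(p for p in universe
                     if set_dist(d, [p], X) + set_dist(d, [p], Y) == dxy)

def optimal_embedding(children, root, leaf_pos, d, universe):
    """children maps each internal node to its (left, right) pair;
    leaf_pos maps each leaf to its fixed position."""
    opt, pi = {}, dict(leaf_pos)

    def find_opt(x):                       # Steps 1 and 2 (postorder)
        if x not in children:
            opt[x] = frozenset([leaf_pos[x]])
        else:
            y, z = children[x]
            find_opt(y)
            find_opt(z)
            opt[x] = gen_intersection(d, universe, opt[y], opt[z])

    def find_pos(x):                       # Step 4 (preorder)
        for child in children.get(x, ()):
            pi[child] = min(opt[child], key=lambda q: d(pi[x], q))
            find_pos(child)

    find_opt(root)
    pi[root] = min(opt[root])              # Step 3: any element of Opt(root)
    find_pos(root)
    return opt, pi

def embedding_cost(children, pi, d):
    """Sum of edge lengths d(pi[parent], pi[child])."""
    return sum(d(pi[x], pi[c]) for x, kids in children.items() for c in kids)

# Illustrative one-dimensional instance on positions 0..20.
line = range(21)
dist = lambda p, q: abs(p - q)
tree = {"root": ("u", "v"), "u": ("a", "b"), "v": ("c", "e")}
leaves = {"a": 0, "b": 4, "c": 10, "e": 20}
opt, pi = optimal_embedding(tree, "root", leaves, dist, line)
print(sorted(opt["root"]), pi["u"], pi["v"], embedding_cost(tree, pi, dist))
```

The brute-force intersection costs time proportional to the size of the universe at every node, which is why the specific metrics below use compact representations of generalized intervals instead.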

3 Specific metrics


Here we consider several possible metrics. Throughout the section, we con- 
sider the variant in which the final result position is free, and commutative 
and associative rearrangement of the expression are not allowed. 

If the set of positions is k-dimensional space, the $l_p$ metric (for integer
$p > 0$) has

$d(x,y) = \left( \sum_{i=1}^{k} |x_i - y_i|^p \right)^{1/p}.$

In the $l_\infty$ metric, the distance between x and y is $\max_i |x_i - y_i|$. In the
discrete metric, the distance between x and y is 0 if $x = y$, or 1 if $x \ne y$.
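
As a concrete reading of these definitions, the three families of metrics can be written as follows. This is a small Python sketch with illustrative function names; positions are assumed to be tuples of numbers of equal length.

```python
def lp_distance(x, y, p):
    """l_p distance between k-vectors x and y, for an integer p > 0."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def linf_distance(x, y):
    """l_infinity distance: the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def discrete_distance(x, y):
    """Discrete metric: 0 for equal positions, 1 otherwise."""
    return 0 if x == y else 1

u, v = (0, 0), (3, 4)
print(lp_distance(u, v, 1), lp_distance(u, v, 2), linf_distance(u, v))
```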


One dimension. In one dimension on either the real line or (more realistically)
a finite interval of the integers, all the $l_p$ metrics are the
same as normal Euclidean distance. This is a realistic metric for one-dimensional
processor arrays. As mentioned in Section 2, this metric
is robust and the algorithm gives a minimum-cost embedding in time
linear in the size of the tree.


The $l_1$ metric in more than one dimension. This is a realistic metric
for a grid of processors with connections to their nearest neighbors.
The metric is the sum of the one-dimensional $l_1$ distances along each
coordinate axis, so Lemma 1 implies that it is robust. In fact, the problem
separates into an independent one-dimensional Euclidean problem
for each coordinate. For fixed dimension, therefore, the optimal layout
can be found in linear time. Section 4 gives a detailed algorithm.


The $l_\infty$ metric in more than one dimension. This is a realistic metric
for a grid of processors with connections to their nearest neighbors
and also to their diagonal neighbors. In two dimensions, for example,
this is realistic for a nine-point mesh of processors. The $l_\infty$ metric is
robust in two dimensions; in fact it is just a rotation and scaling of
the $l_1$ metric, so minimum-cost embeddings can still be found in linear
time. Section 4 gives a detailed algorithm. We do not know whether
$l_\infty$ is robust in more than two dimensions.

The $l_2$ metric in more than one dimension. This is normal Euclidean
distance, which is not realistic for any existing processor architecture.
The metric is not robust in two or more dimensions. For a
two-dimensional example, consider $I = \{(0,1)\}$, $J = \{(0,-1)\}$, and
$p = (1,0)$. Then $I \sqcap J$ is the closed interval on the y axis from (0,1) to
(0,−1). We have $d(p,I) + d(p,J) = 2\sqrt{2} < 3 = d(p, I \sqcap J) + d(I,J)$.
This example also shows that Theorem 2 fails to hold for $l_2$. If y is
the parent of leaves fixed at I and J and x is the parent of y and a
leaf fixed at p, then $\mathrm{Opt}(y) = I \sqcap J$ but there is no minimum-cost
embedding of T(x) with $\pi(y) \in \mathrm{Opt}(y)$. Using a different approach
based on work of Melzak [6], Hwang [5] gives a linear-time algorithm
to find a minimum-cost embedding in the $l_2$ metric in two dimensions.
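
The numbers in this counterexample can be verified directly. The Python sketch below is illustrative: the helper names are introduced here, and `tree_cost` only explores embeddings that place y on the x axis with the parent node at the same position, which is enough to exhibit an embedding cheaper than any that keeps y on the segment between I and J (such an embedding costs at least 2 + 1 = 3).

```python
import math

def dist_to_y_axis_segment(p, lo, hi):
    """Euclidean distance from p to the segment {(0, t) : lo <= t <= hi}."""
    t = min(max(p[1], lo), hi)        # clamp p's second coordinate
    return math.hypot(p[0], p[1] - t)

I, J, p = (0.0, 1.0), (0.0, -1.0), (1.0, 0.0)

# Left and right sides of the robustness inequality for this instance.
lhs = math.dist(p, I) + math.dist(p, J)
rhs = dist_to_y_axis_segment(p, -1.0, 1.0) + math.dist(I, J)

def tree_cost(t):
    """Cost of T(x) when y sits at (t, 0) and x sits at the same point."""
    s = (t, 0.0)
    return math.dist(s, I) + math.dist(s, J) + math.dist(s, p)

print(lhs, rhs)                                   # robustness would need lhs >= rhs
print(tree_cost(0.0), tree_cost(1 / math.sqrt(3)))
```

Placing y at (0,0), a point of Opt(y), costs 3, while moving it off the segment to the point where the three edges meet at 120 degrees is strictly cheaper.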

The hypercube metric. Here the positions are vertices of a hypercube
and the metric is Hamming distance, or shortest path length in the
hypercube. This metric is robust, although representing a generalized
interval requires one datum for each dimension so the running time
of the algorithm increases with dimension. This problem is just the
$l_1$ metric problem for a space of the dimension of the hypercube. Thus
the hypercube-metric problem (without rearrangement) can be solved
in O(nk) time, where n is the size of the tree and k is the dimension
of the cube.

The discrete metric. This metric is realistic for a single processor if the
data structure can be represented in several different ways and there
is a fixed cost to translate from one to another. The metric is robust.
Every nonempty set is a generalized interval. The generalized
intersection $A \sqcap B$ is $A \cap B$ if $A \cap B$ is nonempty, or $A \cup B$ otherwise.
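
The closed form for the discrete metric can be compared against the definition of the generalized intersection on a small universe. The following Python sketch is an illustration only; the four-element universe of representation names is hypothetical.

```python
from itertools import combinations

def discrete(p, q):
    """Discrete metric."""
    return 0 if p == q else 1

def set_dist(X, Y):
    return min(discrete(x, y) for x in X for y in Y)

def gen_intersection_by_definition(universe, A, B):
    """All p with d(p, A) + d(p, B) == d(A, B)."""
    dab = set_dist(A, B)
    return {p for p in universe if set_dist({p}, A) + set_dist({p}, B) == dab}

def gen_intersection_closed_form(A, B):
    """Ordinary intersection if nonempty, otherwise the union."""
    common = A & B
    return common if common else A | B

# Hypothetical universe of four alternative data representations.
UNIVERSE = {"rows", "cols", "blocked", "sparse"}

def nonempty_subsets(s):
    items = sorted(s)
    return [set(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

agree = all(gen_intersection_by_definition(UNIVERSE, A, B)
            == gen_intersection_closed_form(A, B)
            for A in nonempty_subsets(UNIVERSE)
            for B in nonempty_subsets(UNIVERSE))
print(agree)
```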

Transposing arrays. As an example of a more complicated metric that is
still robust, suppose that arrays may be stored either by rows or by
columns. Then some operands must be transposed as well as moved.
We can include this possibility in any of these metrics if the cost of
transposing simply adds to the cost of moving. The discrete metric
with two positions “by rows” and “by columns” is robust, so adding
it to a robust metric still gives a robust metric.

The power-of-two-news metric. Here the positions are a k-dimensional
grid, and the metric is the length of a shortest path whose step lengths
are all powers of two. This metric may be realistic in some cases, such
as the “power-of-two news” moves in the Connection Machine. We do
not have any results for it.

Toroidal metrics. We could extend the $l_p$ metrics to the torus, by allowing
the extremes of a k-dimensional grid of processors to be connected. We
do not have any results for this metric.

String mutation distance. Here the positions are finite strings of sym- 
bols over a finite alphabet. The distance between two strings is the 
smallest number of one-symbol deletions, insertions, and substitutions 
required to transform one into the other. This metric has nothing to 
do with parallel computing (presumably), but it has been studied as 
a model of evolutionary trees of DNA sequences. Sankoff [7] gives an 
exponential-time algorithm to find an optimal embedding. 


4 Algorithms for $l_1$ and $l_\infty$

For definiteness, we now present explicit pseudocode for the $l_1$ norm
(k-dimensional grids and hypercubes) and the two-dimensional $l_\infty$ norm
(nine-point meshes).

The input is the expression tree, represented by a node called root which 
is the top-level operation of the expression, and two node arrays left-child(x) 
and right-child(x) that give the two subexpressions combined at node x. 

The position of node x is $\pi(x)$, which is a k-vector $(\pi_1(x), \ldots, \pi_k(x))$
of coordinates. Initially the positions of the leaves of the tree are given; on
output, the positions of all the nodes have been filled in.

The algorithm uses two recursive procedures: find-opt, which fills in the
Opt values from the leaves of the tree up to the root, and find-pos, which
fills in the positions $\pi$ from the root down to the leaves.

4.1 Grids and hypercubes 

This is the $l_1$ norm in k dimensions. In this case our algorithm reduces to
a method first suggested by Farris [1] in the context of evolutionary trees.
We present it here in a somewhat different form.

The Opt values are generalized intervals. In the $l_1$ norm, these are just
k-dimensional rectangles. Such a rectangle is represented by its minimum
and maximum coordinate in each dimension. We compute the generalized
intersection of two k-dimensional rectangles, $c = a \sqcap b$, as follows. Each
dimension of the generalized intersection is computed independently. One
dimension of a rectangle is a closed interval on the real line; say dimension
i of a is the interval $a_i = [a_i^-, a_i^+]$, and similarly for b and c. If $a_i$ and $b_i$
overlap then $c_i$ is just their intersection. If $a_i$ and $b_i$ do not overlap, then $c_i$
is the closed interval lying between $a_i$ and $b_i$. (In every case, $c_i^-$ is the second
smallest of the four values $\{a_i^-, a_i^+, b_i^-, b_i^+\}$, and $c_i^+$ is the second largest.)
We omit the code for procedure general-int, which does this computation.

In find-pos, we need to compute the closest point p of a generalized
interval a to a given point $\pi(x)$. Again this is done independently in each
dimension. In dimension i, if $\pi_i(x)$ lies inside the interval $[a_i^-, a_i^+]$, then
$p_i = \pi_i(x)$. If $\pi_i(x)$ lies outside the interval, then $p_i$ is the closer of $a_i^-$ and
$a_i^+$ to $\pi_i(x)$. We omit the code for procedure closest-point, which does this
computation.
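
The two omitted procedures are short. One possible rendering in Python follows, as a sketch that assumes a rectangle is stored as a list of (lo, hi) pairs, one per dimension; that representation and the function names are choices made here, not taken from the text.

```python
def general_int(a, b):
    """Generalized intersection of two k-dimensional rectangles (l1 metric).

    In each dimension the result runs from the second smallest to the
    second largest of the four interval endpoints.
    """
    c = []
    for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b):
        s = sorted((a_lo, a_hi, b_lo, b_hi))
        c.append((s[1], s[2]))
    return c

def closest_point(a, p):
    """Point of rectangle a nearest to p: clamp each coordinate into [lo, hi]."""
    return tuple(min(max(x, lo), hi) for (lo, hi), x in zip(a, p))

# Regions A and B of Figure 2, as rectangles.
A = [(10, 30), (80, 90)]
B = [(25, 80), (30, 60)]
print(general_int(A, B))
print(closest_point(A, (25, 70)), closest_point(B, (25, 70)))
```

Both functions do a constant amount of work per dimension.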

For brevity we assume that all the data structures are global, so the only 
parameter to the recursive calls is the current node of the tree. 


procedure optimal-evaluation;
    /* Input is a tree represented by root, left-child, and right-child,
       and positions π(x) for the leaves. Output is π(x) for all nodes. */
    find-opt(root);
    π(root) ← any point in Opt(root);
    find-pos(root)
end optimal-evaluation;


procedure find-opt(node x);
    /* Determine the Opt intervals for x and its descendants. */
    if x is a leaf then
        Opt(x) ← {π(x)}
    else
        y ← left-child(x);
        z ← right-child(x);
        find-opt(y);
        find-opt(z);
        Opt(x) ← general-int(Opt(y), Opt(z))
    fi
end find-opt;


procedure find-pos(node x);
    /* Determine the positions for all proper descendants of x. */
    /* π(x) has already been computed. */
    if x is a leaf then
        /* Do nothing */
    else
        y ← left-child(x);
        z ← right-child(x);
        π(y) ← closest-point(Opt(y), π(x));
        π(z) ← closest-point(Opt(z), π(x));
        find-pos(y);
        find-pos(z)
    fi
end find-pos;


As an example, consider the expression mentioned in Figure 1. Procedure
find-opt will compute the three rectangles shown in Figure 2 as the Opt
values for the three subexpressions. Rectangle C, from (25,60) to (30,80),
is Opt(root), so the root position will be placed arbitrarily in that rectangle.
Suppose the root position is placed at (25,70). Then find-pos will place the
node for $w \oplus x$ at (25,80) and the node for $y \odot z$ at (25,60). The total cost
of the moves to those positions is 135.

Procedures general-int and closest-point each take O(k) time. Procedures
find-opt and find-pos are each called once for each node in the tree.
Therefore the entire computation takes O(kn) time for an expression with
n operands in k dimensions. This is linear for a fixed-dimensional grid.
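
The pseudocode of this section translates almost line for line into Python. The sketch below is one possible rendering under stated assumptions: the tree is a dictionary from internal node to its pair of children, rectangles are lists of (lo, hi) pairs, and the optional `pick_root` argument (a name introduced here) lets the caller choose the point of Opt(root); it is not checked for membership, and the low corner is used by default.

```python
def general_int(a, b):
    """Per dimension: second smallest to second largest endpoint."""
    return [tuple(sorted(ai + bi)[1:3]) for ai, bi in zip(a, b)]

def closest_point(a, p):
    """Clamp each coordinate of p into the rectangle a."""
    return tuple(min(max(x, lo), hi) for (lo, hi), x in zip(a, p))

def l1(p, q):
    return sum(abs(s - t) for s, t in zip(p, q))

def optimal_evaluation(children, root, leaf_pos, pick_root=None):
    opt, pi = {}, dict(leaf_pos)

    def find_opt(x):
        if x in children:
            y, z = children[x]
            find_opt(y)
            find_opt(z)
            opt[x] = general_int(opt[y], opt[z])
        else:
            opt[x] = [(c, c) for c in leaf_pos[x]]   # degenerate rectangle

    def find_pos(x):
        for child in children.get(x, ()):
            pi[child] = closest_point(opt[child], pi[x])
            find_pos(child)

    find_opt(root)
    if pick_root is None:
        pick_root = tuple(lo for lo, _ in opt[root])  # any point would do
    pi[root] = pick_root
    find_pos(root)
    cost = sum(l1(pi[x], pi[c]) for x, kids in children.items() for c in kids)
    return opt, pi, cost

# The expression of Figure 1; internal node names are illustrative.
tree = {"root": ("wx", "yz"), "wx": ("w", "x"), "yz": ("y", "z")}
leaves = {"w": (10, 80), "x": (30, 90), "y": (80, 60), "z": (25, 30)}
opt, pi, cost = optimal_evaluation(tree, "root", leaves, pick_root=(25, 70))
print(opt["root"], pi["wx"], pi["yz"], cost)
```

Because the procedures are recursive, very deep expression trees would need an explicit stack or a raised recursion limit in Python.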

4.2 Nine-point meshes 

A nine-point mesh in two dimensions represents the $l_\infty$ metric. This metric
is just a rotated and scaled version of the $l_1$ metric. Let A and B be the
2 × 2 matrices

[matrix entries not recoverable]

which are inverses of each other. Then the optimum $l_\infty$ positions can
be found by applying the $l_1$ procedure optimal-evaluation to leaf positions
$\pi'(x) = A\pi(x)$. The node positions $\pi'(x)$ returned are then converted back
to the original metric by $\pi(x) = B\pi'(x)$.

Again this whole procedure takes time linear in the number of operands
of the expression.
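
One pair of matrices with the required property can be checked numerically. The choice below, A = (1/2)[[1, 1], [1, −1]] and B = [[1, 1], [1, −1]], is an assumption made for this sketch and is not necessarily the pair used by the authors; any rotation by 45 degrees with a suitable scaling would serve. The sketch works over exact rationals and does not address rounding transformed positions back to grid points.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def to_l1(p):
    """Apply the assumed matrix A = (1/2) [[1, 1], [1, -1]]."""
    x, y = p
    return (HALF * (x + y), HALF * (x - y))

def from_l1(q):
    """Apply the assumed matrix B = [[1, 1], [1, -1]], the inverse of A."""
    a, b = q
    return (a + b, a - b)

def l1(u, v):
    return sum(abs(s - t) for s, t in zip(u, v))

def linf(u, v):
    return max(abs(s - t) for s, t in zip(u, v))

points = list(product(range(-3, 4), repeat=2))

# l_infinity distance in the original space equals l_1 distance after A.
same_distance = all(linf(u, v) == l1(to_l1(u), to_l1(v))
                    for u in points for v in points)
# B undoes A.
round_trip = all(from_l1(to_l1(u)) == u for u in points)
print(same_distance, round_trip)
```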

5 Other variations


5.1 Commutative and associative rearrangement: One dimension

For the Euclidean metric in one dimension, allowing commutative and as- 
sociative rearrangement does not make the problem harder. First, if all the 
operations in the expression are the same then the problem is trivial. 

Lemma 3 Suppose positions in one dimension are specified for the leaves
$x_1, \ldots, x_n$ of an expression consisting entirely of one kind of operation. If
associative and commutative rearrangements are allowed, the minimum cost of
any embedding of any tree with those leaves is the difference between the
positions of the leftmost and rightmost leaves. In a minimum-cost embedding,
the root may be placed anywhere in the closed interval between the leftmost
and rightmost leaves. If the root is outside that interval by a distance of k,
then the cost of the tree is at least k more than optimal.

Proof. Suppose the leaves are numbered so that
$\pi(x_1) \le \pi(x_2) \le \cdots \le \pi(x_n)$. Given a position p such that
$\pi(x_i) \le p \le \pi(x_{i+1})$, we consider the
tree corresponding to the parenthesization

$(\cdots((x_1 + x_2) + x_3) + \cdots + x_i) + (x_{i+1} + \cdots + (x_{n-2} + (x_{n-1} + x_n))\cdots).$

We place the root at position p, assign each internal node of the left subtree
to the same position as its right child, and assign each internal node of
the right subtree to the same position as its left child. The cost of this
embedding is $\pi(x_n) - \pi(x_1)$.

This cost is clearly optimal, because some path in the tree must join
$x_1$ and $x_n$. If the root is outside the interval $[\pi(x_1), \pi(x_n)]$ by k, one of
the paths in the tree joining the root to $x_1$ or $x_n$ must have cost at least
$\pi(x_n) - \pi(x_1) + k$. □

If the expression has more than one kind of operation in it, we can find a
minimum-cost embedding in linear time by partitioning the given expression
into connected subtrees with only one kind of operation. Then rearrangement
is allowed within these homogeneous subtrees but not between them.
We compute Opt(x) in postorder as usual, except that for each homogeneous
subtree we use the lemma to find Opt(x) for its root x. Then we compute
$\pi$ in preorder, using the lemma to place the internal nodes of homogeneous
subtrees.
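
The embedding constructed in the proof of Lemma 3 can be costed edge by edge. The Python sketch below is illustrative; the function names are introduced here, and the split index i follows the lemma's numbering of the sorted leaves.

```python
def left_chain(xs):
    """Left-leaning chain ((x1 + x2) + x3) + ...; each internal node sits
    on its right child. Returns (subtree root position, internal edge cost)."""
    pos, cost = xs[0], 0
    for x in xs[1:]:
        cost += abs(x - pos)    # edge from the previous subtree root;
        pos = x                 # the edge to leaf x itself has length 0
    return pos, cost

def right_chain(xs):
    """Right-leaning chain x1 + (x2 + (...)); each internal node sits
    on its left child."""
    pos, cost = xs[-1], 0
    for x in reversed(xs[:-1]):
        cost += abs(pos - x)
        pos = x
    return pos, cost

def lemma3_cost(leaves, i, p):
    """Cost of the embedding in the proof: the first i sorted leaves go in
    the left subtree, the rest in the right subtree, and the root is at p."""
    xs = sorted(leaves)
    if not (1 <= i < len(xs) and xs[i - 1] <= p <= xs[i]):
        raise ValueError("p must lie between the i-th and (i+1)-th leaf")
    lpos, lcost = left_chain(xs[:i])
    rpos, rcost = right_chain(xs[i:])
    return lcost + rcost + abs(p - lpos) + abs(rpos - p)

leaves = [7, 1, 4, 12, 9]
print(lemma3_cost(leaves, 3, 8), max(leaves) - min(leaves))
```

Every admissible choice of split index and root position gives the same total, the distance between the extreme leaves.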

5.2 Commutative and associative rearrangement: Higher dimensions

The problem becomes much harder in two or more dimensions if rearrangement
is allowed. The two-dimensional problem in which all operations are
the same is the Geometric Steiner Tree problem [3], which is NP-hard for
the $l_1$, $l_2$, and $l_\infty$ metrics. Many heuristics have been studied; Winter [8]
gives a survey.

5.3 Specifying the position of the result 

Consider the variation in which the position of the final result is specified
and rearrangement is not allowed. We reduce this to the free-root problem
as follows: Suppose the given tree is T with root r, and the final result
is constrained to have position p. Let T' be T, plus a new root r' whose
children are r and a new leaf x'. In T' we require the same leaf positions as
in T, except that the root is unconstrained and we require $\pi(x') = p$. An
optimal layout for T' gives an optimal layout for T.
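
The reduction can be sketched in one dimension, where generalized intervals are simply closed intervals. The Python below is an illustration under assumptions: the solver is a compact one-dimensional version of the free-root algorithm, the node names "r'" and "x'" are placeholders that must not clash with existing names, and the reported cost counts the edges through the new root as the final move of the result to p.

```python
def solve_free_root_1d(children, root, leaf_pos):
    """Free-root algorithm on the line; Opt values are (lo, hi) intervals."""
    opt, pi = {}, dict(leaf_pos)

    def find_opt(x):
        if x in children:
            y, z = children[x]
            find_opt(y)
            find_opt(z)
            s = sorted(opt[y] + opt[z])
            opt[x] = (s[1], s[2])
        else:
            opt[x] = (leaf_pos[x], leaf_pos[x])

    def find_pos(x):
        for c in children.get(x, ()):
            lo, hi = opt[c]
            pi[c] = min(max(pi[x], lo), hi)
            find_pos(c)

    find_opt(root)
    pi[root] = opt[root][0]
    find_pos(root)
    return pi

def solve_fixed_result_1d(children, root, leaf_pos, p):
    """Build T' from T by adding a new root r' over r and a new leaf x' at p."""
    ext_children = dict(children)
    ext_children["r'"] = (root, "x'")
    ext_leaves = dict(leaf_pos)
    ext_leaves["x'"] = p
    pi = solve_free_root_1d(ext_children, "r'", ext_leaves)
    total = sum(abs(pi[x] - pi[c])
                for x, kids in ext_children.items() for c in kids)
    layout = {x: q for x, q in pi.items() if x not in ("r'", "x'")}
    return layout, total

tree = {"r": ("y", "z"), "y": ("a", "b"), "z": ("c", "e")}
leaves = {"a": 0, "b": 1, "c": 3, "e": 4}
layout, total = solve_fixed_result_1d(tree, "r", leaves, p=10)
print(layout["r"], layout["y"], layout["z"], total)
```

In this instance the root operation is performed at the end of Opt(r) nearest to p, and the remaining distance is paid once, by the result alone.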